SECTION I
INTRODUCTION

In certain applications of multichannel filtering techniques, the
problem of signal extraction from a c-seismometer array reduces to one of
noise prediction. For example, if the signal consists of vertical motion only
and the array consists of a principal vertical seismometer and c-1 horizontal
seismometers, the horizontal seismometers serve only to predict the vertical
noise on the principal seismometer. Thus, the horizontal seismometers minimize
the vertical noise power on the principal seismometer without affecting the signal
and thereby maximize the signal-to-noise ratio.

For the more general problem, the signal will be present on
all seismometers; however, if the signal covariance matrix between channels is
known, the c-channel outputs can be combined to form a new set of c outputs
(channels) so that the signal will be present on only the number of channels
equal to the rank of the signal covariance matrix. Once in this form, the
channels without signals are used to reduce the noise power on the channels
with signals. Typically, the channels with signals are assigned unity gain
so that the signal extracted from the array outputs will be an unbiased
estimate. Maximum likelihood processing under the zero-mean Gaussian noise
assumption takes this form: the rank-1 signal is isolated on one channel and
the remaining noise-only channels are employed to reduce the noise power on
the signal channel. This report is restricted to rank-1 signals, although
higher-rank signals can be similarly treated.

In  this  report,  we  will  require  that  the  signal  processing  be 
unbiased.  This  restriction  fixes  the  processing  of  the  signal  so  that  the 
signal  may  effectively  be  dropped  from  further  consideration.  The  problem 
to  be  treated  now  is  one  of  noise  prediction  only;  c-1  of  the  channels  are  used 
to  predict  the  noise  on  the  signal  channel. 

After reducing the signal extraction to noise prediction, many
structures for predicting the noise on the principal channel are possible. This
report will use the following linear estimation technique. Each of the
noise-only time functions from the c seismometers, recorded during a time
interval of T sec, is individually Fourier-transformed at frequency points 1/T
apart. At any particular frequency, each channel will measure a single complex
number representing the amplitude/phase of the noise on that channel at that
frequency. The complex amplitude on the principal channel at that frequency can
be estimated by using a linear combination of the other c-1 complex amplitudes.
The complete Fourier transform of the principal channel noise can be estimated
if this procedure is repeated for each frequency in the Fourier transforms. The
predicted noise on the principal channel is then obtained by inverse Fourier
transformation.

At each frequency, the c-1 complex amplitudes are combined
linearly to estimate the complex amplitude of the principal channel. This
weighting of the channels is a complex filtering of the channels. The minimum
variance filter weights (for the particular frequency) would be the Wiener
filter weights based on the channel noise covariance (power spectra matrix)
at that frequency. Typically, the noise covariance is different for different
frequencies; hence, different complex weights are required for each
frequency. Since the problem is identical at each frequency, the noise
prediction problem will be treated at one particular frequency, which is
understood to be any one of the frequencies of the Fourier transform.
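
The frequency-by-frequency procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the report's processing code: it uses c = 2 channels (one principal, one reference), a naive DFT over segments of T = 8 samples, and a made-up noise model in which the principal channel sees a delayed, scaled copy of the reference noise. The names `dft`, `idft`, `weights`, and the noise model itself are hypothetical.

```python
import cmath
import random

def dft(x):
    """Naive discrete Fourier transform (frequency points 1/T apart)."""
    T = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / T) for t in range(T))
            for k in range(T)]

def idft(X):
    """Inverse of dft()."""
    T = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / T) for k in range(T)) / T
            for t in range(T)]

rng = random.Random(1)
T, n_seg = 8, 40  # samples per segment and number of segments (illustrative)

principal, reference = [], []
for _ in range(n_seg):
    ref = [rng.gauss(0, 1) for _ in range(T)]
    # Assumed noise model: delayed, scaled reference plus weak independent noise.
    pri = [0.5 * ref[(t - 1) % T] + 0.1 * rng.gauss(0, 1) for t in range(T)]
    reference.append(ref)
    principal.append(pri)

P = [dft(seg) for seg in principal]
R = [dft(seg) for seg in reference]

# One complex weight per frequency, fitted by least squares over all segments.
weights = []
for k in range(T):
    cross = sum(P[s][k] * R[s][k].conjugate() for s in range(n_seg))
    power = sum(abs(R[s][k]) ** 2 for s in range(n_seg))
    weights.append(cross / power)

# Predict the principal-channel noise of every segment and transform back.
noise_power = residual_power = max_imag = 0.0
for s in range(n_seg):
    predicted = idft([weights[k] * R[s][k] for k in range(T)])
    max_imag = max(max_imag, max(abs(p.imag) for p in predicted))
    residual_power += sum((principal[s][t] - predicted[t].real) ** 2 for t in range(T))
    noise_power += sum(v * v for v in principal[s])
```

Here `residual_power` is the regression error on the design segments themselves, so it is an optimistic measure of how well the weights would predict new data; quantifying that optimism is the subject of the rest of the report.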

In general, the noise covariance or power spectra matrix (at
the understood frequency) is unknown and must be estimated in order to design
the complex filter. By taking many T-sec time intervals, a series of samples
is obtained which can be used to estimate the power spectra matrix, a c x c
Hermitian matrix, from which a filter can be designed. Generally, this
estimated filter will not be the optimal Wiener filter but, hopefully, it will have
a mean absolute square error in predicting the complex amplitude of the
principal channel noise reasonably close to the optimal Wiener filter
performance. The ratio of the performance of the estimated filter to that of
the Wiener filter indicates exactly how the estimated filter compares to the
optimal filter. If the ratio is near unity, the estimated filter is almost as
good as the Wiener filter and is satisfactory. If the ratio is much greater
than unity, considerable improvement in the performance is possible by
using more samples to design the estimated filter.

The relative performance (ratio) of the estimated filter is
a random variable dependent on the noise samples. The probability density
of the ratio, however, is independent of the actual noise covariance if the
complex amplitudes of the noise are jointly zero-mean Gaussian. This fact
is a fortuitous result for filter design. Generally, the noise power spectra
matrix is unknown; this is the reason for estimating the spectra in the first
place. Yet, the statistics of the performance ratio are calculable because
the ratio is independent of the unknown noise covariance. By knowing the
statistical properties of the performance ratio, the designer knows before
any data are taken how well (relative to the optimal Wiener performance) the
estimated filter is likely to perform. Before any experiment is conducted, the
relative expected performance can be determined; the designer has prior
knowledge as to whether the experiment will yield statistically accurate
results. As an example of the type of information available, an estimated
filter for a 5-channel array based on 31 independent samples will perform
(with 90-percent confidence) within 1 dB of the optimal (but unknown) Wiener
filter. The properties of this performance ratio are derived and developed in
the text.

A quantity which is available after the experiment is the
regression error or prediction error of the estimated filter on the design samples.
This quantity is an estimate of the performance of the estimated filter. The
ratio of regression error to optimal Wiener error is also statistically
independent of the noise power spectra matrix; thus, the statistical properties
of this regression-error ratio permit the designer to determine how well the
regression error is likely to estimate the prediction error of the estimated
filter. Generally, the regression error has some false gain; the properties
of the regression-error ratio indicate the likely spread of the false gain.

Section II formulates the mathematical array prediction problem.
Section III proves the invariance of the relative performance ratio and the
regression-error ratio with respect to the actual noise covariance matrix.
Section IV states the statistical properties derived in Appendix A, Section V
gives some applications of the results to a filter design problem, and Section
VI presents a matrix generalization of the invariance theorem.

SECTION II

MEAN-SQUARE ESTIMATION FORMULATION

The source samples (bold symbols indicating column vectors, and plain
capital letters indicating square matrices) are assumed to
be complex Gaussian random vectors satisfying the following definition, where
$\dagger$ denotes conjugate transpose and $|A|$ is the determinant of the matrix $A$.

A complex Gaussian random vector $\mathbf{x}$ of c dimensions has a
probability density (assuming zero mean) of

\[ p(\mathbf{x}) = \frac{1}{\pi^{c}\,|\Sigma|}\exp\left(-\mathbf{x}^{\dagger}\,\Sigma^{-1}\,\mathbf{x}\right) \tag{2-1} \]

where

\[ \Sigma = E\left[\mathbf{x}\,\mathbf{x}^{\dagger}\right] \tag{2-2} \]


Goodman explains the relationship between the c-dimensional
complex vector just defined and a 2c-dimensional real Gaussian random
vector.


Consider the problem of designing a linear filter $\mathbf{f}$ (dimension
c-1) which predicts or estimates $x^{(1)}$ (the first component of $\mathbf{x}$)
from a linear combination of the other c-1 variables $x^{(2)}, \ldots, x^{(c)}$.
The error in estimating $x^{(1)}$ is

\[ e = \begin{bmatrix} 1 & -\mathbf{f}^{\dagger} \end{bmatrix}\mathbf{x} \]

In weighting the elements of $\mathbf{x}$, the complex conjugates of the
elements of $\mathbf{f}$ are used. However, this difficulty is more than offset
by the notational simplification of the definitions used. The mean-square error
$\sigma^{2}(\mathbf{f})$ of any filter $\mathbf{f}$ can be written

\[ E\left[|e|^{2}\right] = \sigma^{2}(\mathbf{f}) = \begin{bmatrix} 1 & -\mathbf{f}^{\dagger} \end{bmatrix} \Sigma \begin{bmatrix} 1 \\ -\mathbf{f} \end{bmatrix} = \text{performance of filter } \mathbf{f} \tag{2-3} \]


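The linear filter $\mathbf{f}_{o}$ which produces the minimum mean-square
error $\sigma_{o}^{2}$ is the Wiener filter given by

\[ \Sigma \begin{bmatrix} 1 \\ -\mathbf{f}_{o} \end{bmatrix} = \begin{bmatrix} \sigma_{o}^{2} \\ \mathbf{0} \end{bmatrix} \tag{2-4} \]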
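
For c = 2 the Wiener equation (2-4) can be solved by hand, which makes a convenient check on the sign and conjugation conventions of Equations (2-3) and (2-4). The sketch below is illustrative only; the covariance entries `s11`, `s12`, `s22` are made-up numbers, not values from the report.

```python
# Hypothetical 2-channel noise covariance (Hermitian, positive definite).
s11, s12 = 2.0, 0.5 + 0.5j
s21, s22 = s12.conjugate(), 1.0

def mse(f):
    """Mean-square error [1, -f*] Sigma [1; -f] of Equation (2-3) for c = 2."""
    return (s11 - s12 * f - f.conjugate() * s21 + f.conjugate() * s22 * f).real

# Second row of (2-4): s21 - s22 * f_o = 0.
f_o = s21 / s22
# First row of (2-4): s11 - s12 * f_o = sigma_o^2.
sigma_o2 = (s11 - s12 * f_o).real
```

Evaluating `mse` at `f_o` reproduces `sigma_o2`, and any perturbation of `f_o` increases the error, as expected of the minimum mean-square solution.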


Assuming that $\Sigma$ is unknown but that n independent samples
($\mathbf{x}_{1}, \ldots, \mathbf{x}_{n}$) are available, an estimate of the
covariance matrix can be obtained as the sample covariance $\hat{\Sigma}$:

\[ \hat{\Sigma} = \frac{1}{n}\sum_{k=1}^{n}\mathbf{x}_{k}\,\mathbf{x}_{k}^{\dagger} \tag{2-5} \]

An estimate $\hat{\mathbf{f}}$ of the true optimum filter $\mathbf{f}_{o}$ is
obtained as the solution of

\[ \hat{\Sigma} \begin{bmatrix} 1 \\ -\hat{\mathbf{f}} \end{bmatrix} = \begin{bmatrix} \hat{\sigma}^{2} \\ \mathbf{0} \end{bmatrix} \tag{2-6} \]

where $\hat{\sigma}^{2}$ is the regression or prediction error of the estimated
filter operating on the n design points; i.e., the quantity $\hat{\sigma}^{2}$
is the estimated performance of the estimated filter.

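The true performance of the estimated filter $\hat{\mathbf{f}}$ is obtained by
inserting $\hat{\mathbf{f}}$ from Equation (2-6) into Equation (2-3). This
performance is defined as

\[ \sigma^{2} = \sigma^{2}(\hat{\mathbf{f}}) = \begin{bmatrix} 1 & -\hat{\mathbf{f}}^{\dagger} \end{bmatrix} \Sigma \begin{bmatrix} 1 \\ -\hat{\mathbf{f}} \end{bmatrix} \tag{2-7} \]

The true estimated-filter performance $\sigma^{2}$ is a real positive
random variable dependent on the n sample points and the covariance $\Sigma$.
Since the estimated filter cannot perform better than the optimal, $\sigma^{2}$
is greater than or equal to $\sigma_{o}^{2}$. The ratio of $\sigma^{2}$ to
$\sigma_{o}^{2}$ is a useful quantity for the filter designer. This ratio is
defined as

\[ \alpha = \frac{\sigma^{2}}{\sigma_{o}^{2}} \ge 1 \tag{2-8} \]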

If $\alpha$ is close to unity, the designer knows that the estimated
filter is nearly as good as the optimal and, therefore, is a well-designed
filter. On the other hand, if $\alpha$ is much greater than unity, the implication
is that the estimated-filter performance can be significantly improved by
using more samples in its design. Actually, $\alpha$ can never be measured or
calculated if the actual noise covariance matrix is unknown. However,
surprisingly enough, for the Gaussian problem, the statistical properties of
$\alpha$ are independent of the actual noise covariance matrix $\Sigma$. In fact, the
probability distribution of $\alpha$ depends only on the number of channels and the
number of samples used to design the filter. Since $\alpha$ measures how near the
estimated-filter performance is to the optimum and since the statistics of
$\alpha$ are independent of a particular noise covariance matrix, the filter designer
can determine in advance the amount of data required to obtain a well-designed
filter for a c-channel problem. For example, if the designer has enough data
so that $\alpha$ is less than 1.05 with a probability of 0.99, he knows that the
estimated-filter performance will almost certainly be within 5 percent of the
unknown optimal performance.

Another important random variable is the ratio of the regression
error to the optimal Wiener error. This ratio is defined as $\beta$, where

\[ \beta = \frac{\hat{\sigma}^{2}}{\sigma_{o}^{2}} \ge 0 \tag{2-9} \]

Again, this ratio is independent of the noise covariance $\Sigma$.
In designing the estimated filter, the regression error can be calculated.
This quantity is the prediction error of the estimated filter on the design
points and is an estimate of the estimated filter's mean-square error. The
relative regression error $\beta$ indicates how close the regression error is to the
optimal Wiener error. Most of the time, $\beta < 1$, indicating a false gain, but
there is the possibility that $\beta \ge 1$.

Since $\beta$ and $\alpha$ are also statistically independent of each other,
their ratio,

\[ \frac{\beta}{\alpha} = \frac{\hat{\sigma}^{2}}{\sigma^{2}} \]

also is independent of $\Sigma$ and gives an intuitive idea of the relation between
the regression error and the true error of the estimated filter.


In summary, the quantity $\alpha$ is useful in determining how
close to optimum an estimated filter can be expected to perform. The
quantity $\beta$ indicates the reliability of the regression error as a measure of
$\sigma^{2}$ and $\sigma_{o}^{2}$.

SECTION III

INVARIANCE PROPERTIES OF $\alpha$ AND $\beta$


Expressions for $\alpha$ and $\beta$, which are independent of the noise
covariance $\Sigma$, are obtained. The quantities $\sigma^{2}$ and
$\hat{\sigma}^{2}$ are linearly dependent only on $\sigma_{o}^{2}$ and have
weights $\alpha$ and $\beta$, respectively.

To simplify the expression for $\sigma^{2}$, let

\[ \Sigma = A A^{\dagger} \tag{3-1} \]

where $A$ is arbitrary except that $a_{11} \neq 0$ and the remainder of the first
column of $A$ is 0; e.g., $A$ could be chosen upper triangular. There is no unique
$A$ satisfying these conditions. $A$ is nonsingular because $\Sigma$ is assumed
nonsingular.

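The Gaussian random vector $\mathbf{x}$ can be visualized as being derived
from another Gaussian random vector $\boldsymbol{\xi}$ with

\[ \mathbf{x} = A\,\boldsymbol{\xi} \tag{3-2} \]

and

\[ E\left[\boldsymbol{\xi}\,\boldsymbol{\xi}^{\dagger}\right] = I = \text{identity} \tag{3-3} \]

It follows that, if

\[ \hat{I} = \frac{1}{n}\sum_{k=1}^{n}\boldsymbol{\xi}_{k}\,\boldsymbol{\xi}_{k}^{\dagger} \tag{3-4} \]

then, from Equation (2-5),

\[ \hat{\Sigma} = A\,\hat{I}\,A^{\dagger} \tag{3-5} \]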


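The expression (2-6) for the estimated filter $\hat{\mathbf{f}}$ becomes

\[ A\,\hat{I}\,A^{\dagger} \begin{bmatrix} 1 \\ -\hat{\mathbf{f}} \end{bmatrix} = \begin{bmatrix} \hat{\sigma}^{2} \\ \mathbf{0} \end{bmatrix} \tag{3-6} \]

Premultiplying Equation (3-6) by $\frac{1}{a_{11}^{*}}A^{-1}$ and grouping
the factors to define the complex (c-1)-dimensional $\hat{\boldsymbol{\Gamma}}$ yields

\[ \hat{I}\left\{ \frac{1}{a_{11}^{*}}A^{\dagger} \begin{bmatrix} 1 \\ -\hat{\mathbf{f}} \end{bmatrix} \right\} = \hat{I} \begin{bmatrix} 1 \\ -\hat{\boldsymbol{\Gamma}} \end{bmatrix} = \begin{bmatrix} \hat{\sigma}^{2}/|a_{11}|^{2} \\ \mathbf{0} \end{bmatrix} \tag{3-7} \]

Observe that Equation (3-7) is a canonical filter equation for a
different problem, viz., estimating $\xi^{(1)}$ from the other c-1 variables
$\xi^{(2)}, \ldots, \xi^{(c)}$. Equation (3-7) is in the canonical form due to
the particular choice of the form of $A$.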
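
Since the $\xi^{(k)}$ are uncorrelated, the optimal filter
$\boldsymbol{\Gamma}_{o}$ for this different problem is

\[ \boldsymbol{\Gamma}_{o} = \mathbf{0} \tag{3-8} \]

Using Equation (3-7) as the definition of $\hat{\boldsymbol{\Gamma}}$, the true
performance $\sigma^{2}$ of the estimated filter $\hat{\mathbf{f}}$ in Equation
(2-7) is related to $\hat{\boldsymbol{\Gamma}}$ by

\[ \sigma^{2} = |a_{11}|^{2}\left(1 + \hat{\boldsymbol{\Gamma}}^{\dagger}\hat{\boldsymbol{\Gamma}}\right) \tag{3-9} \]

The performance of the optimal filter $\mathbf{f}_{o}$ is

\[ \sigma_{o}^{2} = |a_{11}|^{2}\left(1 + \boldsymbol{\Gamma}_{o}^{\dagger}\boldsymbol{\Gamma}_{o}\right) = |a_{11}|^{2} \tag{3-10} \]

so that

\[ \alpha = \frac{\sigma^{2}}{\sigma_{o}^{2}} = 1 + \hat{\boldsymbol{\Gamma}}^{\dagger}\hat{\boldsymbol{\Gamma}} \tag{3-11} \]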


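Using Equation (2-9), Equation (3-7) becomes

\[ \hat{I} \begin{bmatrix} 1 \\ -\hat{\boldsymbol{\Gamma}} \end{bmatrix} = \begin{bmatrix} \beta \\ \mathbf{0} \end{bmatrix} \tag{3-12} \]

which is independent of $\Sigma$, implying that $\alpha$ and $\beta$ do not depend
on the original source covariance. The fact that $|a_{11}|^{2} = \sigma_{o}^{2}$
is also obvious from the definition of $A$ in Equation (3-1).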
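
The invariance argument can be checked numerically on a single set of samples: the ratios computed from the correlated data $\mathbf{x}_{k} = A\boldsymbol{\xi}_{k}$ should equal, up to rounding, the ratios computed from the underlying unit-covariance data $\boldsymbol{\xi}_{k}$. The following is a minimal pure-Python sketch under stated assumptions: the matrix `A`, the sizes `c` and `n`, and all helper names (`solve`, `sample_cov`, `filter_from_cov`, `quad_form`) are hypothetical and chosen only for illustration.

```python
import random

def solve(M, b):
    """Solve the complex linear system M z = b by Gaussian elimination."""
    m = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(m)]
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, m + 1):
                aug[r][k] -= factor * aug[col][k]
    z = [0j] * m
    for i in reversed(range(m)):
        z[i] = (aug[i][m] - sum(aug[i][k] * z[k] for k in range(i + 1, m))) / aug[i][i]
    return z

def sample_cov(samples):
    """(1/n) * sum of v v^dagger over the samples, as in (2-5) and (3-4)."""
    n, c = len(samples), len(samples[0])
    return [[sum(v[i] * v[j].conjugate() for v in samples) / n for j in range(c)]
            for i in range(c)]

def filter_from_cov(S):
    """Solve S [1; -f] = [s2; 0] for the filter f and the error s2."""
    c = len(S)
    f = solve([row[1:] for row in S[1:]], [S[i][0] for i in range(1, c)])
    s2 = (S[0][0] - sum(S[0][j + 1] * f[j] for j in range(c - 1))).real
    return f, s2

def quad_form(S, f):
    """[1, -f^dagger] S [1; -f], the mean-square error of Equation (2-3)."""
    v = [1 + 0j] + [-fj for fj in f]
    return sum(v[i].conjugate() * S[i][j] * v[j]
               for i in range(len(v)) for j in range(len(v))).real

rng = random.Random(0)
c, n = 4, 12

# Hypothetical upper-triangular A (first column zero below a11 = 1).
A = [[0j] * c for _ in range(c)]
for i in range(c):
    A[i][i] = complex(1.0 + i, 0.0)
    for j in range(i + 1, c):
        A[i][j] = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))

# Unit-covariance complex Gaussian samples xi_k and correlated samples x_k = A xi_k.
xi = [[complex(rng.gauss(0, 0.5 ** 0.5), rng.gauss(0, 0.5 ** 0.5)) for _ in range(c)]
      for _ in range(n)]
x = [[sum(A[i][j] * s[j] for j in range(c)) for i in range(c)] for s in xi]
Sigma = [[sum(A[i][k] * A[j][k].conjugate() for k in range(c)) for j in range(c)]
         for i in range(c)]

f_hat, sigma_hat2 = filter_from_cov(sample_cov(x))   # Equation (2-6)
f_opt, sigma_o2 = filter_from_cov(Sigma)             # Equation (2-4)
alpha_x = quad_form(Sigma, f_hat) / sigma_o2         # Equation (2-8)
beta_x = sigma_hat2 / sigma_o2                       # Equation (2-9)

gamma_hat, beta_xi = filter_from_cov(sample_cov(xi)) # Equation (3-12)
alpha_xi = 1 + sum(abs(g) ** 2 for g in gamma_hat)   # Equation (3-11)
```

With this choice of `A`, `sigma_o2` equals $|a_{11}|^{2} = 1$, and `alpha_x`, `beta_x` agree with `alpha_xi`, `beta_xi` to rounding error regardless of the off-diagonal entries drawn for `A`.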

In this section, the estimation problem with covariance $\Sigma$ has
been transformed to the same problem with covariance $I$. The problem
statistics remain the same only because the transformation is linear and the
noise is Gaussian. If the noise were non-Gaussian, the same transformation
would yield the $\boldsymbol{\xi}_{k}$'s with covariance $I$, but the statistics
of the $\boldsymbol{\xi}_{k}$'s would not necessarily be tractable. Under the
Gaussian assumption, all covariances $\Sigma$ have identical statistics for
$\alpha$ and $\beta$; hence, the densities of $\alpha$ and $\beta$ are
independent of $\Sigma$.




SECTION IV

PROBABILITY DENSITIES OF $\alpha$ AND $\beta$

After the original problem with covariance $\Sigma$ is reduced to
the same problem with covariance $I$, the joint probability density of $\alpha$
and $\beta$ follows (Appendix A) from the results in Goodman. The ratios $\alpha$
and $\beta$ are independent, with densities given in Table IV-1. The situation
for n < c involves a degenerate $\hat{\Sigma}$ and is not treated here.

The densities for $\alpha$ and $\beta$ are easily related to $\chi^{2}$ random
variables. The probability density of $2n\beta$ is $\chi^{2}$ with 2(n-c+1)
degrees of freedom. If $\chi_{1}^{2}$ and $\chi_{2}^{2}$ are independent
$\chi^{2}$ variables with $\nu_{1} = 2(c-1)$ and $\nu_{2} = 2(n-c+2)$ degrees of
freedom, the random variable

\[ 1 + \frac{\chi_{1}^{2}}{\chi_{2}^{2}} \]

has the same density as $\alpha$ (Table IV-1). The quantity

\[ \frac{\nu_{2}}{\nu_{1}}\,(\alpha - 1) \]

is F-distributed. The quantity $1/\alpha$ is beta-distributed.

These distributions are common and tabulated in the NBS
Handbook of Mathematical Functions².
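
The $\chi^{2}$ representations above give a direct way to draw random values of $\alpha$ and $\beta$ without simulating any array data. The sketch below uses only the Python standard library and relies on the fact that a $\chi^{2}$ variable with $\nu$ degrees of freedom is twice a unit-scale gamma variable with shape $\nu/2$. The function names and the sample count are illustrative assumptions; n = 80 and c = 12 are the values used in the example of Section V.

```python
import random

def draw_alpha(n, c, rng):
    """One draw of alpha = 1 + chi1^2 / chi2^2 (the common factor 2 cancels)."""
    chi1 = rng.gammavariate(c - 1, 1.0)       # nu1 = 2(c-1) degrees of freedom
    chi2 = rng.gammavariate(n - c + 2, 1.0)   # nu2 = 2(n-c+2) degrees of freedom
    return 1.0 + chi1 / chi2

def draw_beta(n, c, rng):
    """One draw of beta, using 2*n*beta ~ chi-square with 2(n-c+1) d.o.f."""
    return rng.gammavariate(n - c + 1, 1.0) / n

rng = random.Random(0)
n, c, draws = 80, 12, 20000
mean_alpha = sum(draw_alpha(n, c, rng) for _ in range(draws)) / draws
mean_beta = sum(draw_beta(n, c, rng) for _ in range(draws)) / draws
```

The two sample means should land close to n/(n-c+1) and (n-c+1)/n, respectively, the mean values quoted in Table IV-1 and Section V.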

Table IV-1

PROBABILITY DENSITIES OF $\alpha$ AND $\beta$

$\alpha$ = ratio of estimated-filter error to optimal-filter error = $\sigma^{2}/\sigma_{o}^{2}$

    Density:   $p(\alpha) = \dfrac{n!}{(n-c+1)!\,(c-2)!}\;\dfrac{(\alpha-1)^{c-2}}{\alpha^{n+1}}$
    Region:    $\alpha \ge 1$, $n \ge c \ge 2$
    Mean:      $\dfrac{n}{n-c+1}$
    Variance:  $\dfrac{n(c-1)}{(n-c+1)^{2}(n-c)}$

$\beta$ = ratio of regression error of estimated filter to optimal-filter error = $\hat{\sigma}^{2}/\sigma_{o}^{2}$

If the random vector $\mathbf{x}$ is actually real Gaussian with covariance
$\Sigma$, the densities for $\alpha$ and $\beta$ are only slightly altered. In the
density for $\alpha$ (Table IV-1), if n+1 is replaced by (n+1)/2 and if c-1 is
replaced by (c-1)/2, the resulting density is the density for $\alpha$ if
$\mathbf{x}$ is a real Gaussian vector. Similarly, if n-c+1 is replaced by
(n-c+1)/2 and n by n/2 in $p(\beta)$, the new density is correct for $\beta$ if
$\mathbf{x}$ is real. These changes follow from a direct evaluation of
$p(\alpha, \beta)$ for $\mathbf{x}$ real and are not necessarily obvious.

SECTION V

APPLICATIONS

The following is an example to demonstrate how the results in
Section IV are applied.

An array of 11 horizontal seismometers is used to predict
the noise on the principal vertical seismometer. The vertical signal
incident on the array may be ignored in the analysis; the array predicts the
noise and not the signal on the principal seismometer. The noise crosspower
values between the 12 channels are unknown and must be estimated from
measured Fourier transform data; 80 noise samples per frequency are available.

A. PROPERTIES OF $\alpha$

The performance ratio $\alpha$ represents the mean-square error
of the estimated filter relative to that of the Wiener filter. If $\alpha$ is near
unity, the estimated filter performs almost as well as the optimal filter.
Statistically, the value of $\alpha$, obtained from a particular sample of n design
points, depends on how well these n points represent the true covariance matrix.
The properties of $\alpha$ are now determined for the above seismic example.

1. Average Behavior

A filter designed for the above example performs on the
average,

\[ E[\alpha] = \frac{n}{n-c+1} = 1.159 = 0.64\ \text{dB} \]

times the optimal minimum mean-square error.

The standard deviation gives some idea of the possible spread
to expect. A random variable (with a reasonably high probability) can be
said to be within two standard deviations of its mean. For this example,

\[ \text{standard deviation} = \sqrt{\frac{n(c-1)}{(n-c+1)^{2}(n-c)}} = 0.052 \]

Therefore, the performance of the estimated filter is (with
high probability) between 1.055 and 1.263 times the optimal error. More
precise statements can be made by considering the exact density.
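
The arithmetic of this example is easy to reproduce from the mean and variance in Table IV-1. The short sketch below (function names are illustrative, not from the report) evaluates the mean, its value in decibels, the standard deviation, and the two-standard-deviation interval for n = 80 and c = 12.

```python
import math

def alpha_mean(n, c):
    """Mean of alpha from Table IV-1."""
    return n / (n - c + 1)

def alpha_std(n, c):
    """Standard deviation of alpha (square root of the Table IV-1 variance)."""
    return math.sqrt(n * (c - 1) / ((n - c + 1) ** 2 * (n - c)))

n, c = 80, 12
mean = alpha_mean(n, c)
std = alpha_std(n, c)
mean_db = 10 * math.log10(mean)               # power ratio expressed in dB
low, high = mean - 2 * std, mean + 2 * std    # two-standard-deviation interval
```

The resulting values agree with the figures quoted in the text to the precision given there.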

2. Distribution Function

The integrated probability density or distribution of $\alpha$ indicates
the manner in which $\alpha$ can be spread. Figure V-1 shows the plot of

\[ P(\alpha_{0}) = \int_{1}^{\alpha_{0}} p(\alpha)\,d\alpha = P[\alpha \le \alpha_{0}] \]

for n = 80 and c = 12.

The exact probability of being within two standard deviations
can be calculated (using Figure V-1) as

\[ P[1.055 < \alpha \le 1.263] = P(1.263) - P(1.055) = 0.96 \]

Other useful information can be read off the distribution
function. For example, the estimated-filter performance will be no more
than 1.2 times the optimal 80 percent of the time. The estimated filter
will be better than 1.1 times the optimal filter only 10 percent of the time.
Seldom will the estimated-filter performance be 1.05 times or less than
the optimal.
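
Values like these need not be read from a plot. Because $1/\alpha$ is beta-distributed with integer parameters (n-c+2, c-1), the distribution function reduces to a finite binomial sum by the standard identity between the incomplete beta function and the binomial distribution. The sketch below is one way to evaluate it with the Python standard library; the function name `alpha_cdf` is an assumption, and the closed form is derived here from Section IV rather than taken from the report.

```python
import math

def alpha_cdf(a0, n, c):
    """P[alpha <= a0] for integers n >= c >= 2.

    With x = 1/a0, P[alpha <= a0] = P[1/alpha >= x], which for a
    Beta(n-c+2, c-1) variable equals P[Binomial(n, x) <= n-c+1].
    """
    if a0 <= 1.0:
        return 0.0
    x = 1.0 / a0
    return sum(math.comb(n, j) * x ** j * (1.0 - x) ** (n - j)
               for j in range(n - c + 2))

n, c = 80, 12
p_within_two_sigma = alpha_cdf(1.263, n, c) - alpha_cdf(1.055, n, c)
p_below_1p2 = alpha_cdf(1.2, n, c)
p_below_1p1 = alpha_cdf(1.1, n, c)
```

The same function applies to the Introduction's example: `alpha_cdf(10 ** 0.1, 31, 5)` gives the probability that a 5-channel filter designed from 31 samples performs within 1 dB of optimal.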

3. Density Function

Figure V-2 plots the probability density of $\alpha$ as a function of
the number of samples used to estimate the covariance matrix. The density
is quite broad for n = 20, indicating that $\alpha$ probably will be much greater
than unity. Thus, the estimated-filter performance is likely to be several times
the Wiener filter performance. As n increases, the density becomes more
peaked and approaches unity, indicating the improvement gained by increasing
the number of samples used to design the filter.
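
The behavior described for the density can be reproduced from the formula in Table IV-1. The sketch below evaluates the density through log-gamma functions to avoid overflow in the factorials, and locates the peak by setting the derivative of the log-density to zero, which gives $\alpha = (n+1)/(n-c+3)$ for c > 2. The function names are illustrative, and the peak formula is derived here rather than quoted from the report.

```python
import math

def alpha_pdf(a, n, c):
    """Density of alpha from Table IV-1, evaluated in log form."""
    if a <= 1.0:
        return 0.0
    log_p = (math.lgamma(n + 1) - math.lgamma(n - c + 2) - math.lgamma(c - 1)
             + (c - 2) * math.log(a - 1.0) - (n + 1) * math.log(a))
    return math.exp(log_p)

def alpha_mode(n, c):
    """Location of the density peak for c > 2."""
    return (n + 1) / (n - c + 3)

c = 12
for n in (20, 40, 80):
    peak = alpha_mode(n, c)
    print(f"n = {n:3d}: peak at alpha = {peak:.3f}, density there = {alpha_pdf(peak, n, c):.2f}")
```

As n grows, the peak moves toward unity and becomes taller and narrower, which is the trend the text attributes to Figure V-2.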

B. CONFIDENCE-LEVEL PLOTS

Another presentation of the properties of $\alpha$ is made in
Figures V-3 through V-6. The plots are of the number of samples
necessary if estimated-filter performance is to be within 2 dB, 1 dB, 0.5 dB,
or 0.1 dB of the Wiener filter performance for confidence levels of 50 percent,
80 percent, 90 percent, and 99 percent. If c = 12 (as in the example), n = 80
samples is more than enough for 90-percent confidence of being within 1 dB
of optimal (Figure V-5). For a given confidence level and relative
performance (in decibels), these plots specify the number of samples needed.

C. APPROXIMATIONS

Asymptotically, the tail of the $\alpha$ density approaches the tail
of the normal density. Empirically, for levels less than 1 dB and n > 5c,
there is very little error in calculating Figure V-3 by assuming that $\alpha$ is
normally distributed with mean and variance as given in Table IV-1.

[Figure V-2. Probability Densities of $\alpha$ for c = 12 Channels]

[Figure V-6. 99-Percent Confidence Plot (axis label: N SAMPLES)]


The 90-percent level corresponds to no more than 1.282
standard deviations to the right of the mean. Thus, for n > 5c, the 1-dB
line is approximately

\[ 1.26 = \frac{n}{n-c+1} + 1.282\sqrt{\frac{n(c-1)}{(n-c+1)^{2}(n-c)}} \]

This type of approximation can be used for most cases which
are not given explicitly in the figures.
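
The approximation can be turned into a small sample-size calculator by increasing n until the normal-approximation bound drops below the target ratio. The sketch below is illustrative: the function names are assumptions, the default `z = 1.282` corresponds to the 90-percent level quoted above, and the result should only be trusted in the regime n > 5c for which the text states the approximation holds.

```python
import math

def normal_upper_bound(n, c, z=1.282):
    """Mean of alpha plus z standard deviations (normal approximation)."""
    mean = n / (n - c + 1)
    std = math.sqrt(n * (c - 1) / ((n - c + 1) ** 2 * (n - c)))
    return mean + z * std

def samples_needed(c, level_db=1.0, z=1.282):
    """Smallest n whose approximate upper bound is within level_db of optimal."""
    target = 10 ** (level_db / 10)   # 1 dB corresponds to a ratio of about 1.26
    n = c + 1                        # the variance formula requires n > c
    while normal_upper_bound(n, c, z) > target:
        n += 1
    return n

n_req = samples_needed(12)
```

For c = 12 the value of `n_req` comes out below 80, consistent with the statement in Section V-B that 80 samples are more than enough for 90-percent confidence of being within 1 dB of optimal.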

D. PROPERTIES OF $\beta$

The relative performance of the estimated filter is indicated
by $\alpha$ but cannot be measured. The regression error, on the other hand, is
calculated during filter design and relates to the actual performance of the
estimated filter. As previously stated, the regression error (or false gain)
is the prediction error of the estimated filter operating on the n design points;
$\beta$ is the ratio of the regression error to the true optimal error.

If $\beta < 1$, the regression error is less than the optimal error
and represents false gain; however, if $\beta > 1$, the regression error is greater
than the optimal-filter error. Note that only the regression error
($\beta\sigma_{o}^{2}$) is measurable; $\beta$ is not.

1. Average Behavior

The average $\beta$ obtained in a filter design is

\[ E[\beta] = \frac{n-c+1}{n} = 1 - \frac{c-1}{n} \]



This is less than 1.0, indicating an expected false gain. An unbiased
estimate of the optimal minimum mean-square error is

\[ \text{unbiased estimate of } \sigma_{o}^{2} = \frac{n}{n-c+1}\,\hat{\sigma}^{2} \]

This scales up the regression error by its average false gain.

The actual performance of the estimated filter is greater than
$\sigma_{o}^{2}$ by an average amount

\[ E[\alpha] = \frac{n}{n-c+1} \]

Thus, the result will approximate the performance of the estimated filter
if the regression error is scaled by

\[ \frac{E[\alpha]}{E[\beta]} = \frac{n^{2}}{(n-c+1)^{2}} \]

Roughly, the estimated-filter performance is

\[ \frac{n^{2}}{(n-c+1)^{2}} \]

times the measured regression error. For 12 channels and 80 samples, the
true performance of the estimated filter is (on the average) 1.35 times the
actual measured regression error.
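
These two corrections are simple enough to apply directly to a measured regression error. The sketch below is illustrative only; the function names and the sample regression-error value are hypothetical.

```python
def unbiased_optimal_error(regression_error, n, c):
    """Scale the regression error by n/(n-c+1) to estimate sigma_o^2 without bias."""
    return regression_error * n / (n - c + 1)

def performance_scale(n, c):
    """E[alpha]/E[beta] = n^2/(n-c+1)^2, the average ratio of true to regression error."""
    return (n / (n - c + 1)) ** 2

n, c = 80, 12
scale = performance_scale(n, c)

# Hypothetical measured regression error (same units as noise power).
measured = 0.8625
sigma_o2_estimate = unbiased_optimal_error(measured, n, c)
expected_true_error = measured * scale
```

For n = 80 and c = 12 the scale factor is a little over 1.34, in line with the value quoted in the text.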

2. Density Function

Figure V-7 plots the densities of $\beta$ corresponding to the
densities of $\alpha$ in Figure V-2. For n = 20, $\beta$ is likely to be much less than
unity, indicating that the regression error is likely to be much less than the
Wiener filter error. Thus, the false gain is likely to be great. As n increases,
the density of $\beta$ becomes more peaked near unity.

The designer can look at these plots and decide how many
samples are needed to obtain the estimated filter. Figure V-2 indicates the spread
in relative performance to be expected, and Figure V-7 indicates the spread
in regression error (false gain) to be expected.

As stated earlier, $\alpha$ and $\beta$ are independent. Even if $\alpha = 1$
(i.e., estimated filter = optimal filter), $\beta$ has the same density and is no
better an estimate of $\sigma_{o}^{2}$. In fact, to estimate $\sigma_{o}^{2}$ from
the regression error with the optimal filter requires only c-1 fewer samples
(for equivalent estimates of $\sigma_{o}^{2}$) than to estimate $\sigma_{o}^{2}$
without the optimal filter. This is analogous to estimating a Gaussian variance
with and without the mean. Without the mean, one additional sample is required
for equivalent estimates of the variance.

E.  RESTRICTIONS  AND  EXTENSIONS 

This  report  assumes  throughout  that  n  independent  samples  are 
available  for  the  estimated  filter.  If  the  samples  are  actually  correlated,  the 
effective  n  will  be  less  than  the  number  of  samples.  Analysis  for  correlated 
samples  is  needed. 

SECTION VI

GENERALIZED INVARIANCE PROPERTIES


This section discusses a generalization of the parameters
$\alpha$ and $\beta$ to matrix parameters $A$ and $B$. Consider a complex random
variable $\mathbf{x}$ of dimension c with covariance matrix $\Sigma_{x}$. The
covariance matrix can be uniquely represented by $\Sigma_{x} = L_{x}^{T} L_{x}$
where $L_{x}$ is lower triangular (see Lemma 2 of Appendix A). If the
representation is expressed in the form

\[ \Sigma_{x}\, L_{x}^{-1} D_{x} = L_{x}^{T} D_{x} \tag{6-1} \]

where $D_{x}$ is a diagonal matrix consisting of the diagonal elements of $L_{x}$,
the form is seen to be a matrix generalization of Equation (2-4). The
columns of $L_{x}^{-1} D_{x}$ represent prediction-error filters of varying length;
the covariance matrix of the filter errors is

\[ E\left\{ D_{x}\,(L_{x}^{-1})^{T}\,\mathbf{x}\,\mathbf{x}^{T}\,L_{x}^{-1} D_{x} \right\} = D_{x} D_{x} \tag{6-2} \]

An analogous generalization can be made for $\hat{L}_{x}$, $\hat{D}_{x}$ derived
from the estimated covariance matrix $\hat{\Sigma}_{x}$ of Equation (2-5).

The generalization of $\alpha$ to $A$ is now achieved by normalizing
the covariance matrix of the errors of the estimated filters

\[ \hat{D}_{x}\,(\hat{L}_{x}^{-1})^{T}\,\Sigma_{x}\,\hat{L}_{x}^{-1}\hat{D}_{x} \]

to get

\[ A_{x} = D_{x}^{-1}\hat{D}_{x}\,(\hat{L}_{x}^{-1})^{T}\,\Sigma_{x}\,\hat{L}_{x}^{-1}\hat{D}_{x} D_{x}^{-1} \tag{6-3} \]

Similarly, $\beta$ generalizes to

\[ B_{x} = D_{x}^{-1}\hat{D}_{x}\,(\hat{L}_{x}^{-1})^{T}\,\hat{\Sigma}_{x}\,\hat{L}_{x}^{-1}\hat{D}_{x} D_{x}^{-1} \tag{6-4} \]


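The matrices $A$ and $B$ are now evaluated for the random
variable $\mathbf{y} = (L_{x}^{T})^{-1}\mathbf{x}$ and the observed data
$\mathbf{y}_{i} = (L_{x}^{T})^{-1}\mathbf{x}_{i}$. First, note that

\[ \Sigma_{y} = (L_{x}^{T})^{-1}\,\Sigma_{x}\,L_{x}^{-1} = I \]

and

\[ \hat{\Sigma}_{y} = (L_{x}^{T})^{-1}\,\hat{\Sigma}_{x}\,L_{x}^{-1} = (L_{x}^{T})^{-1}\,\hat{L}_{x}^{T}\hat{L}_{x}\,L_{x}^{-1} \tag{6-5} \]

so that

\[ \hat{L}_{y} = \hat{L}_{x} L_{x}^{-1} \]

and

\[ \hat{D}_{y} = \hat{D}_{x} D_{x}^{-1} \tag{6-6} \]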

These relationships are now used to express

\[ A_{y} = \hat{D}_{y}\,(\hat{L}_{y}^{-1})^{T}\,(\hat{L}_{y})^{-1}\,\hat{D}_{y} \tag{6-7} \]

in the form

\[ A_{y} = D_{x}^{-1}\hat{D}_{x}\,(\hat{L}_{x}^{-1})^{T}\,L_{x}^{T} L_{x}\,(\hat{L}_{x}^{-1})\,\hat{D}_{x} D_{x}^{-1} = A_{x} \tag{6-8} \]

Similarly, $B_{y} = B_{x}$, so the distribution of the matrices
$A$ and $B$ is independent of the covariance of the vector for which they are
defined.

APPENDIX A

DERIVATION OF PROBABILITY DENSITIES OF $\hat{\boldsymbol{\Gamma}}$, $\alpha$, AND $\beta$


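The text equations specifying the filter $\hat{\boldsymbol{\Gamma}}$ (c-1 dimensions)
and the ratios $\alpha$ and $\beta$ are

\[ \hat{I} \begin{bmatrix} 1 \\ -\hat{\boldsymbol{\Gamma}} \end{bmatrix} = \begin{bmatrix} \beta \\ \mathbf{0} \end{bmatrix}, \qquad \alpha = 1 + \hat{\boldsymbol{\Gamma}}^{\dagger}\hat{\boldsymbol{\Gamma}} \tag{A-1} \]

and

\[ \hat{I} = \text{sample covariance} = \frac{1}{n}\sum_{k=1}^{n}\boldsymbol{\xi}_{k}\,\boldsymbol{\xi}_{k}^{\dagger} \]

where $\boldsymbol{\xi}_{k}$ are complex Gaussian random vectors (zero-mean,
c-dimensional) with covariance

\[ E\left[\boldsymbol{\xi}_{k}\,\boldsymbol{\xi}_{k}^{\dagger}\right] = I \]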

Prior to evaluating $p(\alpha, \beta)$, the required results from Goodman
are given.

•  Lemma 1

If $\mathbf{x}_{1}, \ldots, \mathbf{x}_{n}$ are independent samples from a complex
Gaussian (zero-mean, c-dimensional for $c \le n$) source with a covariance

\[ E\left[\mathbf{x}_{k}\,\mathbf{x}_{k}^{\dagger}\right] = \Sigma \]

then,

\[ B = \sum_{i=1}^{n}\mathbf{x}_{i}\,\mathbf{x}_{i}^{\dagger} \tag{A-2} \]

has a probability density

\[ p(B) = p(B_{11}, B_{12}, \ldots, B_{1c}, B_{22}, B_{23}, \ldots, B_{cc}) = \frac{|B|^{\,n-c}}{F(n, c, \Sigma)}\exp\left[-\mathrm{Tr}\left(\Sigma^{-1}B\right)\right] \tag{A-3} \]

where

\[ F(n, c, \Sigma) = \pi^{\frac{1}{2}c(c-1)}\,(n-1)!\,\cdots\,(n-c)!\;|\Sigma|^{n} \tag{A-4} \]

The density above is defined on positive semi-definite Hermitian
matrices $B$. The density is the complex Wishart distribution.
• Corollary

If $A$ and $B$ are ($c \times c$) Hermitian and $A^{-1}$ exists, then

$$ \int \cdots \int dB \, |B|^{n-c} \exp[-\mathrm{Tr}(AB)] = F(n, c, A^{-1}) \tag{A-5} $$

where $B$ is positive semi-definite and

$$ dB = dB_{11} \, dB_{12} \cdots dB_{1c} \, dB_{22} \cdots dB_{cc} $$

• Lemma 2

If $B$ is a ($c \times c$) complex random matrix satisfying the Wishart distribution in Lemma 1, there is a unique lower triangular complex matrix $L$ with positive real diagonal elements $L_{11}, \ldots, L_{cc}$, so that

$$ L^{*} L = B $$

and

$$ p(L) = p(L_{11}, L_{21}, \ldots, L_{c1}; L_{22}, \ldots, L_{c2}; L_{33}, \ldots; L_{cc}) = \frac{2^{c}}{F(n, c, \Sigma)} L_{cc}^{2n-1} L_{c-1,c-1}^{2n-3} \cdots L_{11}^{2n-2c+1} \exp[-\mathrm{Tr}(\Sigma^{-1} L^{*} L)] \tag{A-6} $$

where $L_{jj}$ is real and positive, $L_{jk}$ ($j > k$) is complex, and $L_{jk}$ ($j < k$) $= 0$.
Lemma 1 permits the writing of the probability density of the matrix $n \hat{\Sigma}$. The analysis proceeds by changing variables first to $L$ (lower triangular matrix) and then to a set of variables including $\hat{\Gamma}$. Finally, by integrating out all unwanted variables, $p(\hat{\Gamma}, \beta)$ remains. The joint density $p(\alpha, \beta)$ follows easily by a final change of variables.


From Lemma 1, $B$ is just $n$ times the sample covariance. Therefore, defining

$$ B = n \hat{\Sigma} $$

and

$$ L^{*} L = B $$

then,

$$ \hat{\Sigma} = \frac{1}{n} L^{*} L \tag{A-7} $$


Since $L$ is lower triangular, $\hat{L}$ is also lower triangular. The definition of $\hat{\Gamma}$ then becomes

(A-8)

and, defining $\beta$ in terms of $\hat{L}$,

(A-9)
To make the change of variables $L_{21}, \ldots, L_{c1}$ to $\hat{\Gamma}^{(1)}, \ldots, \hat{\Gamma}^{(c-1)}$ but to retain the other elements of $L$, we partition $L$ as

$$ L = \begin{bmatrix} L_{11} & 0 \\ b & M \end{bmatrix} \tag{A-10} $$

where

$$ b = (L_{21}, \ldots, L_{c1})^{T} $$

The partition in Equation (A-10) implies that the equation defining the change of variables is

$$ b - M \hat{\Gamma} = 0 \quad (c-1 \text{ dimension}) \tag{A-11} $$
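The change of variables in (A-11) can be illustrated numerically. The sketch below is an assumption-laden check rather than the report's method: it takes $\hat{\Gamma}$ to be the least-squares filter that predicts channel 1 from the other channels, takes $\beta$ to be the regression error for unit true covariance, and uses a hypothetical helper `lower_factor` to obtain the lower triangular $L$ with $L^{*} L = B$.

```python
import numpy as np

def lower_factor(B):
    """Lower-triangular L with positive diagonal such that B = L^H L."""
    c = B.shape[0]
    J = np.eye(c)[::-1]                    # exchange (index-reversal) matrix
    C = np.linalg.cholesky(J @ B @ J)      # J B J = C C^H
    return (J @ C @ J).conj().T

rng = np.random.default_rng(1)
c, n = 4, 10
# n independent complex Gaussian samples with covariance I (one per column)
x = (rng.normal(size=(c, n)) + 1j * rng.normal(size=(c, n))) / np.sqrt(2)
B = x @ x.conj().T                         # n times the sample covariance

L = lower_factor(B)
L11, b, M = L[0, 0].real, L[1:, 0], L[1:, 1:]   # partition as in (A-10)
gamma_from_L = np.linalg.solve(M, b)       # solves b - M Gamma = 0
beta_from_L = L11**2 / n

# Direct least-squares quantities from the sample covariance, for comparison
s_hat = B / n
gamma_ls = np.linalg.solve(s_hat[1:, 1:], s_hat[1:, 0])
beta_ls = (s_hat[0, 0] - s_hat[0, 1:] @ gamma_ls).real
alpha = 1.0 + np.vdot(gamma_from_L, gamma_from_L).real
print(alpha, beta_from_L)
```

Under these assumptions the filter obtained from the triangular factor agrees with the least-squares filter, which is why the partitioned form of $L$ is a convenient set of variables.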


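The Jacobian of this transformation is

$$ \frac{\partial(b_R^{(1)}, b_I^{(1)}, b_R^{(2)}, \ldots, b_I^{(c-1)})}{\partial(\hat{\Gamma}_R^{(1)}, \hat{\Gamma}_I^{(1)}, \hat{\Gamma}_R^{(2)}, \ldots, \hat{\Gamma}_I^{(c-1)})} = L_{22}^{2} L_{33}^{2} \cdots L_{cc}^{2} \tag{A-12} $$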
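The Jacobian in (A-12) is the determinant of the real linear map induced by the complex relation $b = M \hat{\Gamma}$, which equals $|\det M|^2$. A short numerical sketch, with an arbitrary randomly generated $M$ chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
k = 3                                       # k = c - 1
M = np.tril(rng.normal(size=(k, k)) + 1j * rng.normal(size=(k, k)))
M[np.diag_indices(k)] = rng.uniform(0.5, 2.0, size=k)   # real, positive diagonal

# Real form of b = M @ gamma acting on (Re gamma, Im gamma) -> (Re b, Im b).
# Interleaving real and imaginary parts instead only permutes rows and columns
# together, which leaves the determinant unchanged.
real_map = np.block([[M.real, -M.imag],
                     [M.imag,  M.real]])
jacobian = np.linalg.det(real_map)
product_of_squares = np.prod(np.diag(M).real ** 2)
print(jacobian, product_of_squares)
```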
The density before the transformation (Lemma 2) is

$$ p(L) = \frac{2^{c}}{F(n, c, I)} L_{cc}^{2n-1} \cdots L_{11}^{2n-2c+1} \exp[-\mathrm{Tr}(L^{*} L)] \tag{A-13} $$

and after,

$$ p(L_{11}, \hat{\Gamma}, M) = \frac{2^{c}}{F(n, c, I)} L_{11}^{2n-2c+1} e^{-L_{11}^{2}} \, |M|^{2} L_{cc}^{2n-1} \cdots L_{22}^{2n-2c+3} \exp\{-\mathrm{Tr}[(I + \hat{\Gamma} \hat{\Gamma}^{*}) M^{*} M]\} \tag{A-14} $$

Equation (A-6) of Lemma 2 is used to integrate over $M$. The resulting marginal density of $(L_{11}, \hat{\Gamma})$ is

$$ p(L_{11}, \hat{\Gamma}) = \frac{2 L_{11}^{2n-2c+1} e^{-L_{11}^{2}}}{F(n, c, I)} \, F(n+1, c-1, [I + \hat{\Gamma} \hat{\Gamma}^{*}]^{-1}) \tag{A-15} $$


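The final transformation $\beta = L_{11}^{2} / n$ obtains the density for $\beta$ and $\hat{\Gamma}$:

$$ p(\beta, \hat{\Gamma}) = \left[ \frac{n^{n-c+1} \beta^{n-c} e^{-n\beta}}{(n-c)!} \right] \left[ \frac{n!}{\pi^{c-1} (n-c+1)!} \cdot \frac{1}{(1 + |\hat{\Gamma}|^{2})^{n+1}} \right] = p(\beta) \, p(\hat{\Gamma}) \tag{A-16} $$

for $\beta \ge 0$, $\hat{\Gamma}$ complex.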
Thus, $\beta$ and $\hat{\Gamma}$ are statistically independent.
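The independence of $\beta$ and $\hat{\Gamma}$ can be examined with a small Monte Carlo experiment. This is a sketch under assumptions: the data have unit covariance, $\hat{\Gamma}$ is taken as the least-squares filter for channel 1, and the reference means $(n-c+1)/n$ for $\beta$ and $1 + (c-1)/(n-c+1)$ for $\alpha = 1 + |\hat{\Gamma}|^2$ are those implied by the factored density. The seed and the number of trials are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
c, n, trials = 3, 8, 20000

# trials independent data sets, each with n complex Gaussian samples of covariance I
x = (rng.normal(size=(trials, c, n)) + 1j * rng.normal(size=(trials, c, n))) / np.sqrt(2)
s_hat = x @ x.conj().transpose(0, 2, 1) / n          # batched sample covariances

s_rest = s_hat[:, 1:, 1:]                            # covariance of channels 2..c
s_cross = s_hat[:, 1:, 0]                            # cross term with channel 1
gamma = np.linalg.solve(s_rest, s_cross[..., None])[..., 0]

alpha = 1.0 + np.sum(np.abs(gamma) ** 2, axis=1)
beta = (s_hat[:, 0, 0] - np.sum(s_cross.conj() * gamma, axis=1)).real

corr = np.corrcoef(alpha, beta)[0, 1]                # near zero if independent
print(beta.mean(), alpha.mean(), corr)
```

A sample correlation near zero is only consistent with independence, not a proof of it; the factorization of the density is what establishes the result.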


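The density for the quantity

$$ \alpha = 1 + \hat{\Gamma}^{*} \hat{\Gamma} = 1 + |\hat{\Gamma}|^{2} $$

can now be determined as

$$ p(\alpha) = \frac{n!}{(n-c+1)! \, (c-2)!} \cdot \frac{(\alpha - 1)^{c-2}}{\alpha^{n+1}} \quad \text{for } \alpha \ge 1, \; n \ge c \tag{A-17} $$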
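The two marginal densities can be coded directly and checked for normalization. The sketch below uses only the Python standard library; the function names and the simple midpoint integrator are illustrative choices, and the substitution $u = 1 - 1/\alpha$ is used only to map the infinite range of $\alpha$ onto $(0, 1)$.

```python
import math

def p_alpha(a, n, c):
    """Density of alpha from (A-17); requires c >= 2 and n >= c."""
    if a < 1.0:
        return 0.0
    log_coef = math.lgamma(n + 1) - math.lgamma(n - c + 2) - math.lgamma(c - 1)
    if a == 1.0:
        return math.exp(log_coef) if c == 2 else 0.0
    return math.exp(log_coef + (c - 2) * math.log(a - 1.0) - (n + 1) * math.log(a))

def p_beta(b, n, c):
    """Density of beta, the first factor of (A-16)."""
    if b <= 0.0:
        return 0.0
    k = n - c + 1
    return math.exp(k * math.log(n) + (k - 1) * math.log(b) - n * b - math.lgamma(k))

def integrate(f, lo, hi, steps=20000):
    """Midpoint rule; adequate for these smooth integrands."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

n, c = 10, 3
# alpha = 1 / (1 - u), d(alpha) = du / (1 - u)^2
total_alpha = integrate(lambda u: p_alpha(1.0 / (1.0 - u), n, c) / (1.0 - u) ** 2, 0.0, 1.0)
total_beta = integrate(lambda b: p_beta(b, n, c), 0.0, 10.0)
mean_beta = integrate(lambda b: b * p_beta(b, n, c), 0.0, 10.0)
print(total_alpha, total_beta, mean_beta)
```

Both integrals should be close to one, and the mean of $\beta$ should be close to $(n-c+1)/n$, which shows that the regression error tends to understate the optimum error when $n$ is not much larger than $c$.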
APPENDIX B
SUMMARY OF NOTATION

$x$                 c-dimensional complex Gaussian vector representing Fourier transforms of c channels in an array

$x^{(1)}$           First component of $x$; principal channel being estimated by the other c-1 channels

$\Sigma$            c by c positive definite covariance matrix of channels; $\Sigma = E\{x x^{*}\}$

$\hat{\Sigma}$      Sample covariance of n independent samples $x_k$ ($k = 1, \ldots, n$) from the array; estimate of true covariance $\Sigma$

$\Gamma$            (c-1) dimensional filter used to weight the c-1 channels to form an estimate of $x^{(1)}$

$\Gamma_o$          Optimum filter; minimum mean-square-error filter; Wiener filter

$\hat{\Gamma}$      Estimated filter; filter designed on basis of n independent
13. ABSTRACT

In designing a digital multichannel filter from a limited sample of noise, a highly important parameter, $\alpha$, is defined as the true mean-square error of the estimated filter (i.e., the average long-term performance of the filter obtained from the noise sample) divided by the true mean-square error of the optimum filter.

The value of $\alpha$, which is equal to or greater than one, is not known before or after an experiment since the true covariance of the data is required to calculate its value; however, the probability density of $\alpha$ turns out to be invariant with respect to the true covariance and depends only on the amount of data and the number of channels in the filter. Thus, one can determine before collecting any data how long a sample is needed in order to design a filter which is within 1 db (for example) of optimum with 90-percent confidence. A second similar parameter, $\beta$, defined as the estimated mean-square error of the estimated filter (i.e., the regression error) divided by the true mean-square error of the optimum filter, is highly useful in deciding the reliability of the apparent effectiveness of the designed filter.

The probability densities of $\alpha$ and $\beta$ are derived for the Gaussian assumption, and graphs useful in experiment design are presented in this report.
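The sample-length question posed in the abstract can be sketched in a few lines. This is an illustrative calculation, not the report's graphs: it assumes the density $p(\alpha) = \frac{n!}{(n-c+1)!\,(c-2)!} (\alpha-1)^{c-2} / \alpha^{n+1}$ derived in Appendix A, and uses the fact that $u = 1 - 1/\alpha$ then follows a beta distribution with integer parameters, so the cumulative probability is a finite binomial sum. The function names and the search limit `n_max` are hypothetical.

```python
import math

def alpha_cdf(a, n, c):
    """P(alpha <= a) for c >= 2 channels and n >= c independent samples."""
    if a <= 1.0:
        return 0.0
    u = 1.0 - 1.0 / a
    # P(alpha <= a) = P(Binomial(n, u) >= c - 1), computed via the complement
    lower_tail = sum(math.comb(n, j) * u**j * (1.0 - u) ** (n - j) for j in range(c - 1))
    return 1.0 - lower_tail

def samples_needed(c, loss_db=1.0, confidence=0.9, n_max=1000):
    """Smallest n with P(alpha <= 10^(loss_db/10)) >= confidence."""
    a = 10.0 ** (loss_db / 10.0)
    for n in range(c, n_max + 1):
        if alpha_cdf(a, n, c) >= confidence:
            return n
    raise ValueError("no n up to n_max meets the requirement")

for channels in (2, 5, 10):
    print(channels, samples_needed(channels))
```

For `c = 2` the sum reduces to the closed form $1 - a^{-n}$, which gives a quick sanity check on the implementation.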